A Graph Approach to Spelling Correction in Domain-Centric Search
نویسندگان
چکیده
Spelling correction for keyword-search queries is challenging in restricted domains such as personal email (or desktop) search, due to the scarcity of query logs, and due to the specialized nature of the domain. For that task, this paper presents an algorithm that is based on statistics from the corpus data (rather than the query log). This algorithm, which employs a simple graph-based approach, can incorporate different types of data sources with different levels of reliability (e.g., email subject vs. email body), and can handle complex spelling errors like splitting and merging of words. An experimental study shows the superiority of the algorithm over existing alternatives in the email domain.
منابع مشابه
Improving Accuracy of DGPS Correction Prediction in Position Domain using Radial Basis Function Neural Network Trained by PSO Algorithm
Differential Global Positioning System (DGPS) provides differential corrections for a GPS receiver in order to improve the navigation solution accuracy. DGPS position signals are accurate, but very slow updates. Improving DGPS corrections prediction accuracy has received considerable attention in past decades. In this research work, the Neural Network (NN) based on the Gaussian Radial Basis Fun...
متن کاملSpelling Correction Based on User Search Contextual Analysis and Domain Knowledge
We propose a spelling correction algorithm that combines trusted domain knowledge and query log information for query spelling correction. This algorithm uses query reformulations in the query log and bigram language models built from queries for efficiently and effectively generating correction suggestions and ranking them to find valid corrections. Experimental results show that for both simp...
متن کاملDesign and implementation of Persian spelling detection and correction system based on Semantic
Persian Language has a special feature (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, design and implementation of NLP tools for Persian are more challenging than other languages (e.g. English or German). Spelling tools are used widely for editing user texts like emails and text in editors. Also developing Persian tools will provide Persian progr...
متن کاملSpelling Correction as an Iterative Process that Exploits the Collective Knowledge of Web Users
Logs of user queries to an internet search engine provide a large amount of implicit and explicit information about language. In this paper, we investigate their use in spelling correction of search queries, a task which poses many additional challenges beyond the traditional spelling correction problem. We present an approach that uses an iterative transformation of the input query strings int...
متن کاملTowards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011